31

The Organization of Knowledge

producing papers to increase exacerbates the challenge since reviewing a paper is

usually accorded a lower priority than writing one. All that can be hoped for perhaps

is that the most important results at least are properly incorporated into the edifice of

reliable knowledge, but this raises the question of how to define “importance”, which

is often difficult to perceive in advance of what is subsequently done with the results.

Another difficulty is that researchers do not always want to publish their work in

what might seem to be the most appropriate journal for their discipline: journals

covering a broad range of fields and carrying a large number of advertisements tend

to be disproportionately popular among scientists at present, often to the neglect

of the journals published by learned societies, even those of which the authors are

members. Work of an interdisciplinary nature is especially problematical, and is

often rejected by journals devoted to the disciplines between which the work falls,

not least because reviewers may lack the breadth of knowledge to properly appraise

the work.

Nevertheless, the great progress in the sophistication of Internet search engines

(even general-purpose ones like Google are effective) and the availability on the

Internet of at least abstracts of nearly all papers, even those published in journals

that formerly might have been deemed to be “obscure”, means that, despite its vast

size, the literature is now more accessible than perhaps ever before. Largely thanks

to CrossRef’s “digital object identifier” (DOI) associated with almost every paper,

we now have an efficient system of distribution of papers to everyone who needs

them, as discussed by Bernal just before the Second World War. 13 Very few journals are

now printed for browsing in libraries or for individual subscribers, and the days of the

postal distribution of paper reprints by their authors are past; a scientist can almost

instantly find and access whatever is needed from laboratory or study.

A complicating feature is the emergence, and rapid growth, of “open access”

journals. While many are available only online and hence much cheaper to produce

than conventional printed journals, nevertheless some costs are incurred, and these

are financed by article processing charges, which are fees charged to authors upon

acceptance of a manuscript. This creates a pernicious conflict of interest for the

publishers: 14 whereas the number of subscriptions to a conventionally financed journal

will depend on the quality of its content, the income of an open-access publisher

is proportional to the number of papers accepted and published. The (commercial)

publisher is, therefore, directly motivated to publish as many papers as possible and

an easy way to achieve that is to abandon the traditions and obligations of honest and

rigorous peer review, and undertake much more perfunctory editing than is customary

in the case of a traditional journal. 15

Given these difficulties, it is not surprising that literature mining is presently

carried out in a rather restricted fashion, such as merely searching for all mentions

13 Bernal (1967), Chap. XI, especially p. 295 (the work was originally published in 1939).

14 Beall (2014).

15 A wealthy learned society, with an income derived from other sources, could decide to publish

its journal at its own expense. In any case, subscriptions to learned society journals are often much

cheaper than those to commercial ones, but the former might be less fashionable than the latter.